We introduce a machine-learning (ML) framework for the high-throughput benchmarking of diverse representations of chemical systems against materials and molecular datasets. The guiding principle of the benchmarking approach is to evaluate raw descriptor performance by limiting model complexity to simple regression schemes while enforcing best ML practices, allowing learning progress to be assessed through learning curves along series of synchronized train-test splits. The resulting models are intended to provide informed baselines for future method development, alongside an indication of how easy a given dataset is to learn. Through a comparative analysis of the training results across a diverse set of physicochemical, topological and geometric representations, we present the relative merits of these representations as well as their mutual correlations.
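As a rough illustration of this recipe, the sketch below fits a deliberately simple kernel ridge regressor to a precomputed descriptor matrix and records test error over a series of synchronized train-test splits of increasing size; the descriptor matrix `X`, target vector `y`, and hyperparameters are placeholders, and this is not the authors' benchmarking code.

```python
# Minimal sketch: scoring a raw descriptor with a simple regression scheme via a
# learning curve over synchronized train/test splits. All inputs are placeholders.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

def learning_curve(X, y, train_sizes=(0.1, 0.2, 0.4, 0.8), n_splits=10, seed=0):
    """Mean/std test MAE per training fraction, averaged over repeated splits."""
    rng = np.random.RandomState(seed)   # fixed seed: every descriptor sees identical splits
    curve = []
    for frac in train_sizes:
        split_seeds = rng.randint(0, 2**31 - 1, size=n_splits)
        errors = []
        for s in split_seeds:
            X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=frac, random_state=int(s))
            model = KernelRidge(alpha=1e-3, kernel="rbf")   # deliberately simple regression scheme
            model.fit(X_tr, y_tr)
            errors.append(mean_absolute_error(y_te, model.predict(X_te)))
        curve.append((frac, float(np.mean(errors)), float(np.std(errors))))
    return curve
```

Reusing the same seed across descriptors keeps the splits synchronized, so curves from different representations remain directly comparable.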
Understanding our brain is one of the most daunting tasks, one we cannot expect to complete without the use of technology. MindBigData aims to provide a comprehensive and updated dataset of brain signals related to a diverse set of human activities, so it can inspire the use of machine learning algorithms as a benchmark of 'decoding' performance from raw brain activities into the corresponding mental (or physical) task labels. We use commercial off-the-shelf EEG devices, or custom ones built by us, to explore the limits of the technology. We describe the data collection procedures for each of the sub-datasets and for every headset used to capture them. We also report possible applications in the field of Brain-Computer Interfaces (BCI) that could impact the lives of billions in almost every sector, from game-changing use cases in healthcare to industry and entertainment, to name a few. In the end, why not use our brains directly to 'disintermediate' the senses, as the final HCI (Human-Computer Interaction) device? This is simply what we call the journey from Type to Touch to Talk to Think.
Modern mobile burst photography pipelines capture and merge a short sequence of frames to recover an enhanced image, but often disregard the 3D nature of the scene they capture, treating pixel motion between images as a 2D aggregation problem. We show that in a "long-burst", forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth. To this end, we devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion. Our plane plus depth model is trained end-to-end, and performs coarse-to-fine refinement by controlling which multi-resolution volume features the network has access to at what time during training. We validate the method experimentally, and demonstrate geometrically accurate depth reconstructions with no additional hardware or separate data pre-processing and pose-estimation steps.
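A minimal sketch of this test-time optimization idea is given below, assuming a hypothetical depth network `depth_net` and a `warp_to_reference` reprojection helper: it jointly optimizes the depth representation and per-frame camera motion against a photometric loss. This is a schematic of the approach, not the paper's implementation.

```python
# Minimal sketch of test-time optimization on a long-burst: jointly fit an
# implicit depth model and per-frame camera motion by minimizing photometric
# reprojection error. `depth_net` and `warp_to_reference` are placeholders.
import torch

def fit_long_burst(frames, ref_idx, depth_net, n_iters=2000, lr=1e-3):
    # frames: (N, 3, H, W) burst; ref_idx indexes the reference frame.
    N = frames.shape[0]
    poses = torch.zeros(N, 6, requires_grad=True)          # small per-frame motion (se(3) params)
    opt = torch.optim.Adam(list(depth_net.parameters()) + [poses], lr=lr)
    ref = frames[ref_idx]
    for _ in range(n_iters):
        depth = depth_net(ref)                              # (1, H, W) depth of the reference view
        loss = 0.0
        for i in range(N):
            if i == ref_idx:
                continue
            warped = warp_to_reference(frames[i], depth, poses[i])  # reproject frame i into ref view
            loss = loss + (warped - ref).abs().mean()       # photometric consistency
        opt.zero_grad()
        loss.backward()
        opt.step()
    return depth_net(ref).detach(), poses.detach()
```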
The following article presents a memetic algorithm applying deep reinforcement learning (DRL) for solving practically oriented dual resource constrained flexible job shop scheduling problems (DRC-FJSSP). In recent years, there has been extensive research on DRL techniques, but without considering realistic, flexible and human-centered shopfloors. A research gap can be identified in the context of make-to-order oriented, discontinuous manufacturing as it is often found in medium-sized companies with high service levels. From practical industry projects in this domain, we recognize requirements to depict flexible machines, human workers and capabilities, setup and processing operations, material arrival times, complex job paths with parallel tasks for bill-of-material (BOM) manufacturing, sequence-dependent setup times and (partially) automated tasks. On the other hand, intensive research has been done on metaheuristics in the context of DRC-FJSSP. However, there is a lack of suitable and generic scheduling methods that can be holistically applied in sociotechnical production and assembly processes. In this paper, we first formulate an extended DRC-FJSSP induced by the practical requirements mentioned. Then we present our proposed hybrid framework with parallel computing for multicriteria optimization. Through numerical experiments with real-world data, we confirm that the framework generates feasible schedules efficiently and reliably. Utilizing DRL instead of random operations leads to better results and outperforms traditional approaches.
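The sketch below illustrates, with assumed placeholder types (schedule objects exposing a `makespan`, a `crossover` operator, a trained `policy`), how a memetic loop can use a DRL policy to pick local-search operators instead of choosing them at random; it conveys the hybrid idea only and is not the proposed framework.

```python
# Minimal sketch of a memetic loop for a flexible job shop problem in which a
# learned policy, rather than uniform random choice, selects the local-search
# operator. Schedule objects, `crossover`, `policy`, and the operator functions
# are hypothetical placeholders.
import random

def memetic_search(population, policy, operators, n_generations=100):
    for _ in range(n_generations):
        # Genetic step: keep the better half and recombine assignments/sequences.
        parents = sorted(population, key=lambda s: s.makespan)[: len(population) // 2]
        offspring = [crossover(*random.sample(parents, 2)) for _ in range(len(population) // 2)]
        # Memetic step: the DRL policy chooses which local-search operator to apply.
        for child in offspring:
            op_idx = policy.select(child.features())   # learned operator choice
            operators[op_idx](child)                    # e.g. swap, insert, reassign worker
        population = parents + offspring
    return min(population, key=lambda s: s.makespan)
```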
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community, but have not received as much attention as lower-level tasks like speech and speaker recognition. In particular, there are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers. Recent work has begun to introduce such benchmark datasets for several tasks. In this work, we introduce several new annotated SLU benchmark tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape. We contribute four tasks: question answering and summarization involve inference over longer speech sequences; named entity localization addresses the speech-specific task of locating the targeted content in the signal; dialog act classification identifies the function of a given speech utterance. We follow the blueprint of the Spoken Language Understanding Evaluation (SLUE) benchmark suite. In order to facilitate the development of SLU models that leverage the success of pre-trained speech representations, we will be publishing for each task (i) annotations for a relatively small fine-tuning set, (ii) annotated development and test sets, and (iii) baseline models for easy reproducibility and comparisons. In this work, we present the details of data collection and annotation and the performance of the baseline models. We also perform sensitivity analysis of pipeline models' performance (speech recognizer + text model) to the speech recognition accuracy, using more than 20 state-of-the-art speech recognition models.
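As a hedged illustration of the pipeline sensitivity analysis, the sketch below runs several speech recognizers, feeds each transcript to the same downstream text model, and reports the task score alongside the word error rate; `asr_models`, `text_model`, and `metric` are hypothetical stand-ins for real systems.

```python
# Minimal sketch of pipeline-model (ASR + text model) sensitivity analysis.
# `asr_models`, `text_model`, and `metric` are placeholders for real systems
# and whatever score a given SLU task uses.
from jiwer import wer   # word error rate between reference and hypothesis transcripts

def pipeline_sensitivity(asr_models, text_model, utterances, references, labels, metric):
    results = []
    for name, asr in asr_models.items():
        hyps = [asr.transcribe(u) for u in utterances]   # speech -> text
        preds = [text_model.predict(h) for h in hyps]    # text -> SLU prediction
        results.append({
            "asr": name,
            "wer": wer(references, hyps),                # recognition accuracy
            "score": metric(labels, preds),              # downstream task performance
        })
    return sorted(results, key=lambda r: r["wer"])
```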
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
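The sketch below illustrates two of the reported practices, k-fold cross-validation on the training set and ensembling of the resulting fold models by prediction averaging; `build_model` is a placeholder for any scikit-learn-style estimator factory, and the snippet is not tied to any particular challenge solution.

```python
# Minimal sketch: k-fold cross-validation on the training set and an ensemble of
# the k identically configured fold models via prediction averaging.
import numpy as np
from sklearn.model_selection import KFold

def cross_validate_and_ensemble(X, y, X_test, build_model, k=5, seed=0):
    # X, y, X_test: numpy arrays; build_model: callable returning a fresh estimator.
    fold_models, val_scores = [], []
    for train_idx, val_idx in KFold(n_splits=k, shuffle=True, random_state=seed).split(X):
        model = build_model()
        model.fit(X[train_idx], y[train_idx])
        val_scores.append(model.score(X[val_idx], y[val_idx]))
        fold_models.append(model)
    # Ensemble of identical models: average the k fold predictions on the test data.
    test_pred = np.mean([m.predict(X_test) for m in fold_models], axis=0)
    return test_pred, float(np.mean(val_scores))
```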
Convolutional neural networks (CNNs) are one of the most successful computer vision systems to solve object recognition. Furthermore, CNNs have major applications in understanding the nature of visual representations in the human brain. Yet it remains poorly understood how CNNs actually make their decisions, what the nature of their internal representations is, and how their recognition strategies differ from humans. Specifically, there is a major debate about the question of whether CNNs primarily rely on surface regularities of objects, or whether they are capable of exploiting the spatial arrangement of features, similar to humans. Here, we develop a novel feature-scrambling approach to explicitly test whether CNNs use the spatial arrangement of features (i.e. object parts) to classify objects. We combine this approach with a systematic manipulation of effective receptive field sizes of CNNs as well as minimal recognizable configurations (MIRCs) analysis. In contrast to much previous literature, we provide evidence that CNNs are in fact capable of using relatively long-range spatial relationships for object classification. Moreover, the extent to which CNNs use spatial relationships depends heavily on the dataset, e.g. texture vs. sketch. In fact, CNNs even use different strategies for different classes within heterogeneous datasets (ImageNet), suggesting CNNs have a continuous spectrum of classification strategies. Finally, we show that CNNs learn the spatial arrangement of features only up to an intermediate level of granularity, which suggests that intermediate rather than global shape features provide the optimal trade-off between sensitivity and specificity in object classification. These results provide novel insights into the nature of CNN representations and the extent to which they rely on the spatial arrangement of features for object classification.
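The following sketch conveys the general flavor of a patch-scrambling probe: it permutes the cells of an n x n grid over the image, destroying the spatial arrangement of features while keeping local appearance intact, so that the accuracy gap between intact and scrambled inputs indicates reliance on spatial relationships. It is a generic illustration, not the authors' exact feature-scrambling procedure.

```python
# Minimal sketch of a patch-scrambling probe: split each image into a grid of
# patches, permute the patches, and reassemble. Classifier accuracy on intact
# vs. scrambled inputs then probes the use of spatial arrangement.
import torch

def scramble_patches(images, grid=4, generator=None):
    # images: (B, C, H, W) with H and W divisible by `grid`.
    B, C, H, W = images.shape
    ph, pw = H // grid, W // grid
    patches = images.unfold(2, ph, ph).unfold(3, pw, pw)        # (B, C, grid, grid, ph, pw)
    patches = patches.reshape(B, C, grid * grid, ph, pw)
    perm = torch.randperm(grid * grid, generator=generator)
    patches = patches[:, :, perm]                                # shuffle patch order
    patches = patches.reshape(B, C, grid, grid, ph, pw)
    return patches.permute(0, 1, 2, 4, 3, 5).reshape(B, C, H, W)  # reassemble the image
```

Varying `grid` controls the granularity at which spatial structure is destroyed, from coarse (large parts kept intact) to fine (only local texture preserved).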
Robust and reliable anonymization of chest radiographs constitutes an essential step before publishing large datasets of such data for research purposes. The conventional anonymization process is carried out by obscuring personal information in the images with black boxes and by removing or replacing meta-information. However, such simple measures retain biometric information in the chest radiographs, allowing patients to be re-identified through linkage attacks. We therefore see an urgent need to obfuscate the biometric information appearing in the images. To the best of our knowledge, we propose the first deep-learning-based approach for the targeted anonymization of chest radiographs while maintaining data utility for diagnostic and machine-learning purposes. Our model architecture is a composition of three independent neural networks that, when used collectively, learn a deformation field capable of impeding patient re-identification. The individual influence of each component is investigated in an ablation study. Quantitative results on the ChestX-ray14 dataset show a reduction of patient re-identification from 81.8% to 58.6% in the area under the receiver operating characteristic curve (AUC), with little impact on abnormality classification performance. This indicates the ability to preserve the underlying abnormality patterns while increasing patient privacy. Furthermore, we compare the proposed deep-learning-based anonymization approach with differentially private image pixelization and demonstrate the superiority of our method in resolving the privacy-utility trade-off for chest radiographs.
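The core image operation in such deformation-based obfuscation, resampling a radiograph under a predicted displacement field, can be sketched as below; the flow-predicting networks and the competing re-identification and utility objectives are only indicated in comments, and the names are illustrative rather than the paper's code.

```python
# Minimal sketch: apply a learned deformation field to a chest radiograph with
# grid_sample. The network producing `flow` (trained to hinder re-identification
# while preserving diagnostic content) is assumed and not shown here.
import torch
import torch.nn.functional as F

def deform(image, flow):
    # image: (B, 1, H, W); flow: (B, 2, H, W) predicted pixel displacements (x, y).
    B, _, H, W = image.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    base_grid = torch.stack((xs, ys), dim=-1).expand(B, H, W, 2).to(image.device)  # identity grid
    # Convert pixel displacements to grid_sample's normalized [-1, 1] coordinates.
    norm_flow = torch.stack((flow[:, 0] / (W / 2), flow[:, 1] / (H / 2)), dim=-1)
    return F.grid_sample(image, base_grid + norm_flow, align_corners=True)
```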
When PAC-Bayes theory is used for risk certification, it is usually necessary to estimate and bound the risk of the PAC-Bayes posterior. Many works in the literature adopt a method that makes heavy use of the dataset and therefore incurs a high computational cost. This manuscript presents a very general alternative that yields computational savings on the order of the dataset size.
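For context, a standard PAC-Bayes-kl certificate of the kind such risk certification typically relies on is written out below; the manuscript's exact bound and estimation procedure may differ. Estimating the empirical term by Monte Carlo over posterior samples, each evaluated on all data points, is what naively couples the cost to the dataset size.

```latex
% A standard PAC-Bayes-kl certificate (Maurer-style), stated for context only;
% with probability at least $1-\delta$ over an i.i.d. sample of size $n$:
\[
  \mathrm{kl}\!\left(\widehat{E}(Q)\,\middle\|\,E(Q)\right)
  \;\le\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{n},
\]
% where $Q$ is the PAC-Bayes posterior, $P$ the prior, $\widehat{E}(Q)$ the
% empirical risk of $Q$ (typically estimated by sampling hypotheses from $Q$
% and evaluating each on the $n$ data points), and $E(Q)$ the true risk
% being certified.
```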
Advances in machine learning (ML) have generated strong interest in using this technology to support decision making. Although the predictions of complex ML models are often more accurate than those of traditional tools, such models typically hide the reasoning behind their predictions from users, which can hinder adoption and limit insight. Motivated by this tension, research has proposed explainable artificial intelligence (XAI) techniques that uncover the patterns found by ML. Although both ML and XAI carry high hopes, there is little empirical evidence of their benefits to traditional businesses. To this end, we analyze data on 220,185 customers of an energy retailer, predict cross-purchases with up to 86% correctness (AUC), and show that the XAI method SHAP provides explanations that hold for actual buyers. We further outline implications for research in information systems, XAI, and relationship marketing.
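A minimal sketch of such a modeling-plus-explanation setup is given below: a gradient-boosted classifier predicts cross-purchases and SHAP attributes each prediction to customer features. The file name, feature layout, and model choice are assumptions for illustration, not the study's actual pipeline.

```python
# Minimal sketch: predict cross-purchases with a tree ensemble and explain the
# predictions with SHAP. "customers.csv" and its columns are hypothetical.
import shap
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X = pd.read_csv("customers.csv")        # hypothetical customer feature table
y = X.pop("cross_purchase")             # 1 if the customer bought an additional product

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

# Per-customer explanations: SHAP values quantify each feature's contribution
# to the predicted cross-purchase probability.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_te)
shap.summary_plot(shap_values, X_te)
```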